Robust Speech Recognition Using Perceptual Wavelet Denoising and Mel-frequency Product Spectrum Cepstral Coefficient Features

Authors

  • Mohamed Cherif Amara Korba
  • Djemil Messadeg
  • Rafik Djemili
  • Hocine Bourouba
Abstract

To improve the performance of automatic speech recognition (ASR) systems, a new method is proposed to extract features capable of operating at very low signal-to-noise ratios (SNRs). The basic idea introduced in this article is to enhance speech quality as the first stage of Mel-cepstra-based recognition systems, since it is well known that cepstral coefficients perform better in clean environments. In this speech enhancement stage, noise robustness is improved by a perceptual wavelet packet (PWP) based denoising algorithm using two thresholding procedures, soft thresholding and modified soft thresholding, with a penalized threshold. The next stage of the proposed method is feature extraction, performed using the Mel-frequency product spectrum cepstral coefficients (MFPSCCs) introduced by D. Zhu and K. K. Paliwal [2]. The Hidden Markov Model Toolkit (HTK) was used throughout our experiments, which were conducted for various noise types provided by the NOISEX-92 database at different SNRs. Comparison of the proposed approach with the conventional MFCC-based (baseline) feature extraction method shows that the proposed method improves the recognition accuracy rate by 44.71 %, with an average value of 14.80 % computed over 7 SNR levels for white Gaussian noise conditions.
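The two building blocks named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are assumptions: soft thresholding is taken in its standard form (shrink each coefficient toward zero by the threshold), and the product spectrum is taken as the power spectrum multiplied by the group delay function, which reduces to X_R·Y_R + X_I·Y_I, where X is the DFT of the frame x(n) and Y is the DFT of n·x(n). The function names are hypothetical, and the perceptual wavelet packet decomposition, the modified soft thresholding, the penalized threshold selection, and the Mel filterbank, logarithm and DCT stages are all omitted.

```python
import cmath
import math


def soft_threshold(coeffs, threshold):
    """Standard soft thresholding: zero small coefficients, shrink the rest."""
    out = []
    for c in coeffs:
        mag = abs(c) - threshold
        # Keep the sign of c, reduce its magnitude by the threshold.
        out.append(math.copysign(mag, c) if mag > 0 else 0.0)
    return out


def dft(x):
    """Naive O(N^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def product_spectrum(frame):
    """Assumed product spectrum: power spectrum times group delay.

    With X = DFT{x(n)} and Y = DFT{n * x(n)}, this equals
    Re(X) * Re(Y) + Im(X) * Im(Y) in every frequency bin.
    """
    X = dft(frame)
    Y = dft([t * s for t, s in enumerate(frame)])
    return [a.real * b.real + a.imag * b.imag for a, b in zip(X, Y)]


if __name__ == "__main__":
    # Coefficients with magnitude below the threshold are set to zero.
    print(soft_threshold([-3.0, -1.0, 0.0, 1.0, 3.0], 2.0))
    # A unit impulse delayed by one sample has unit power and unit group delay.
    print(product_spectrum([0.0, 1.0, 0.0, 0.0]))
```

For a unit impulse delayed by one sample, both the power spectrum and the group delay equal 1 in every bin, so the product spectrum is approximately 1 everywhere, which gives a quick sanity check of the assumed definition. A practical front end would use an FFT and apply the thresholding to wavelet packet coefficients instead of raw samples.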

Similar resources

DWT and LPC based feature extraction methods for isolated word recognition

In this article, new feature extraction methods, which utilize wavelet decomposition and reduced order linear predictive coding (LPC) coefficients, have been proposed for speech recognition. The coefficients have been derived from the speech frames decomposed using discrete wavelet transform. LPC coefficients derived from subband decomposition (abbreviated as WLPC) of speech frame provide bette...

Regularized minimum variance distortionless response-based cepstral features for robust continuous speech recognition

In this paper, we present robust feature extractors that incorporate a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator, used in many front-ends including the conventional MFCC, to estimate the speech power spectrum. Direct spectrum estimators, e.g., single tapered periodogram, have high var...

Cepstrum derived from differentiated power spectrum for robust speech recognition

In this paper, cepstral features derived from the differential power spectrum (DPS) are proposed for improving the robustness of a speech recognizer in the presence of background noise. These robust features are computed from the speech signal of a given frame through the following four steps. First, the short-time power spectrum of the speech signal is computed from the speech signal through the fast ...

Robust Speech Recognition Using Wavelet Coefficient Features

We propose a new vein of feature vectors for robust speech recognition that use denoised wavelet coefficients. Greater robustness to unexpected additive noise or spectrum distortions begins with more robust acoustic features. The use of wavelet coefficients is motivated by human acoustic process modelling and by the ability of wavelet coefficients to capture important time and frequency feature...

Feature extraction from higher-lag autocorrelation coefficients for robust speech recognition

In this paper, a feature extraction method that is robust to additive background noise is proposed for automatic speech recognition. Since the background noise corrupts the autocorrelation coefficients of the speech signal mostly at the lower-time lags, while the higher-lag autocorrelation coefficients are least affected, this method discards the lower-lag autocorrelation coefficients and uses o...

Journal:
  • Informatica (Slovenia)

Volume 32

Publication date 2008